• This project tackles the challenge of reducing harmful AI content across multiple languages, using translation to extend safety measures to languages where direct safety data is lacking (a rough sketch of the general idea follows below).

    High Impact
    Monday, March 11, 2024
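
    The item above gives only the general idea, so here is a minimal, hypothetical sketch of translation-backed safety filtering: route text from a language with little safety data through a high-resource language where a classifier exists. The helper functions and the keyword heuristic are illustrative stand-ins, not the project's actual code.

    ```python
    # Sketch: reuse an English-only safety classifier for other languages via translation.

    def translate_to_english(text: str, source_lang: str) -> str:
        # Stand-in for a real machine-translation call; here it just passes text through.
        return text

    def english_safety_score(text: str) -> float:
        # Stand-in for a trained English safety classifier; a toy keyword heuristic.
        flagged = {"bomb", "poison"}
        words = {w.strip(".,!?").lower() for w in text.split()}
        return 1.0 if words & flagged else 0.0

    def is_harmful(text: str, lang: str, threshold: float = 0.5) -> bool:
        # Route non-English text through English, where safety data is plentiful.
        english = text if lang == "en" else translate_to_english(text, source_lang=lang)
        return english_safety_score(english) >= threshold

    print(is_harmful("How do I make a bomb?", lang="en"))  # True
    ```
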
  • In a recent study, researchers used a virtual team of attackers called 'Evil Geniuses' to test the safety of LLM-based agents. They found that these agents are less robust against malicious attacks, provide more complex harmful responses, and make inappropriate replies harder to detect.

    High Impact
  • Beth Barnes' nonprofit METR is partnering with major AI companies like OpenAI and Anthropic to develop safety tests for advanced AI systems, a move echoed by government initiatives. The focus is on assessing risks such as AI autonomy and self-replication, though there's acknowledgment that safety evaluations are still in early stages and cannot guarantee AI safety. METR's work is seen as pragmatic, despite concerns that current tests may not be sufficiently reliable to justify the rapid advancement of AI technologies.

  • Google DeepMind introduced the Frontier Safety Framework to address risks posed by future advanced AI models. This framework identifies critical capability levels (CCLs) for potentially harmful AI capabilities, evaluates models against these CCLs, and applies mitigation strategies when thresholds are reached.
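
    As a rough illustration of the framework's shape (not DeepMind's actual definitions), the sketch below defines hypothetical critical capability levels with thresholds, evaluates a model against each, and collects the mitigations that are triggered.

    ```python
    # Sketch: evaluate a model against hypothetical CCL thresholds and trigger mitigations.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class CriticalCapabilityLevel:
        name: str         # e.g. "autonomy", "cyber-offense" (illustrative names)
        threshold: float  # evaluation score at which the CCL is considered reached
        mitigation: str   # mitigation to apply once the threshold is hit

    def evaluate_model(run_eval: Callable[[str], float],
                       ccls: list[CriticalCapabilityLevel]) -> list[str]:
        """Return the mitigations triggered by a model's evaluation scores."""
        triggered = []
        for ccl in ccls:
            score = run_eval(ccl.name)
            if score >= ccl.threshold:
                triggered.append(f"{ccl.name}: apply {ccl.mitigation}")
        return triggered

    # Toy usage with made-up scores standing in for real dangerous-capability evals.
    ccls = [
        CriticalCapabilityLevel("autonomy", threshold=0.7, mitigation="deployment gating"),
        CriticalCapabilityLevel("cyber-offense", threshold=0.6, mitigation="weight security controls"),
    ]
    fake_scores = {"autonomy": 0.4, "cyber-offense": 0.75}
    print(evaluate_model(lambda name: fake_scores[name], ccls))
    ```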

  • OpenAI formed a Safety and Security Committee after announcing the training of its new foundation model. This committee will be tasked with issuing recommendations to the board about actions to take as model capabilities continue to improve.

  • Anthropic's Responsible Scaling Policy aims to prevent catastrophic AI safety failures by identifying high-risk capabilities, testing models regularly, and implementing strict safety standards, with a focus on continuous improvement and collaboration with industry and government.

    Friday, May 24, 2024
  • Anthropic researchers have unveiled a method to interpret the inner workings of its large language model, Claude Sonnet, by mapping out millions of features corresponding to a diverse array of concepts. This interpretability could lead to safer AI by allowing specific manipulations of these features to steer model behaviors. The study demonstrates a significant step in understanding and improving the safety mechanisms of AI language models.
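
    The sketch below illustrates the feature-steering idea in the loosest terms: given a dictionary of learned feature directions (for example, from a sparse autoencoder's decoder), a chosen feature can be amplified or suppressed by adding a scaled copy of its direction to the model's activations. The dimensions, random "decoder", feature index, and scale are all illustrative assumptions, not Anthropic's implementation.

    ```python
    # Sketch: steer an activation vector along one learned feature direction.
    import numpy as np

    d_model, n_features = 512, 4096
    rng = np.random.default_rng(0)

    # Stand-in for a learned dictionary of feature directions: one unit-norm row per feature.
    decoder = rng.normal(size=(n_features, d_model))
    decoder /= np.linalg.norm(decoder, axis=1, keepdims=True)

    def steer(activations: np.ndarray, feature_id: int, scale: float) -> np.ndarray:
        """Shift activations along one feature's direction (positive scale = amplify)."""
        return activations + scale * decoder[feature_id]

    acts = rng.normal(size=(d_model,))            # stand-in residual-stream activation
    steered = steer(acts, feature_id=123, scale=4.0)
    ```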

  • OpenAI has announced the formation of a new Safety and Security Committee to oversee risk management for its projects and operations. The company recently began training its next frontier model. The committee will make safety and security recommendations to the full board of directors and will oversee processes and safeguards related to alignment research, protecting children, upholding election integrity, assessing societal impacts, and implementing security measures.

  • Jan Leike, a former OpenAI researcher who resigned over AI safety concerns, has joined Anthropic to lead a new "superalignment" team focusing on AI safety and security. Leike's team will address scalable oversight, weak-to-strong generalization, and automated alignment research.

  • Zico Kolter, a Professor at Carnegie Mellon University and expert in AI safety and robustness, has joined OpenAI's Board of Directors and its Safety and Security Committee. His extensive research in AI safety, alignment, and model robustness will enhance OpenAI's efforts to ensure AI benefits humanity.

  • MIT and other institutions have launched the AI Risk Repository, a comprehensive database of over 700 documented AI risks, to help organizations and researchers assess and mitigate evolving AI risks using a two-dimensional classification system and regularly updated information.

  • SB-1047 passed the California State Assembly by a 45-11 vote and now faces one more procedural state Senate vote before heading to the governor's desk. The bill requires AI model developers to implement a 'kill switch' that can be activated if a model starts introducing novel threats to public safety and security. It has been criticized for focusing on risks from an imagined future AI rather than present-day harms such as deepfakes and misinformation.

  • OpenAI and Anthropic have agreed to allow the US government early access to their major new AI models before public release to enhance safety evaluations as part of a memorandum with the US AI Safety Institute.

  • The article discusses the urgent need for global cooperation on AI safety as systems become increasingly powerful and potentially dangerous. Drawing parallels to the Pugwash Conferences that addressed nuclear weapons during the Cold War, it highlights the International Dialogues on AI Safety, a recent initiative that brings together leading AI scientists from both China and the West to foster dialogue and build consensus on AI safety as a global public good. The article emphasizes that rapid advances in AI capabilities pose existential risks, including the potential loss of human control and malicious uses of AI systems. To address these risks, the scientists involved have proposed three main recommendations:

    1. **Emergency Preparedness Agreements and Institutions**: Establish an international body to facilitate collaboration among AI safety authorities, helping states agree on the technical and institutional measures needed to prepare for advanced AI systems and ensuring that a minimal set of effective safety preparedness measures is adopted globally.

    2. **Safety Assurance Framework**: Require developers of frontier AI to demonstrate that their systems do not cross defined red lines, such as those that could lead to autonomous replication or the creation of weapons of mass destruction, backed by rigorous testing and evaluation as well as post-deployment monitoring.

    3. **Independent Global AI Safety and Verification Research**: Create Global AI Safety and Verification Funds to support independent research into AI safety, focused on developing verification methods that enable states to assess compliance with safety standards and frameworks.

    The piece concludes by underscoring the importance of a collective effort among scientists, states, and other stakeholders to navigate the challenges posed by AI. It stresses that the ethical responsibility of scientists, who understand the technology's implications, is vital to correcting the current imbalance in AI development, which is heavily shaped by profit motives and national security concerns, and it advocates a proactive approach to ensure that AI serves humanity's best interests while mitigating its risks.